An unsupervised fuzzy ensemble algorithmic scheme for gene expression data analysis
Abstract
Background: In recent years, unsupervised ensemble clustering methods have been successfully applied to DNA microarray data analysis to improve the accuracy and reliability of clustering results. A major problem remains, however: classes of functionally correlated examples (e.g. subclasses of diseases characterized at the bio-molecular level) are in general not clearly separable, and in many cases the same gene may belong to different functional classes (e.g. it may participate in different biological processes).

Results: We propose an ensemble clustering algorithmic scheme, based on a fuzzy approach, that directly handles overlapping classes and genes or samples that may belong to more than one cluster at the same time. Several fuzzy ensemble clustering algorithms may be derived from this scheme, depending on how the multiple clusterings are combined and how the consensus clustering is generated. We test some of the proposed ensemble algorithms on two DNA microarray data sets available on the web, comparing the results with other single and ensemble clustering methods.

Conclusions: Our proposed fuzzy ensemble approach may be applied to discover classes of co-expressed genes or subclasses of functionally related examples, and in principle it may be
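The general idea of a fuzzy ensemble scheme can be illustrated with a small sketch. The abstract does not say which base clusterer, diversity mechanism, or combination rule is used, so everything below is an assumption chosen for illustration: fuzzy c-means as the base algorithm, Gaussian random projections to diversify the base clusterings, a product t-norm to build a pairwise fuzzy similarity matrix, and a final fuzzy c-means run on the rows of that matrix to obtain the consensus. The function names `fuzzy_c_means` and `fuzzy_ensemble` and all parameters are hypothetical.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, rng=None):
    """Plain fuzzy c-means. Returns an (n_samples, c) membership matrix
    whose rows sum to 1."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    # Random initial memberships, normalized per sample.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance from every sample to every center, shape (n, c).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U


def fuzzy_ensemble(X, c, n_base=10, proj_dim=None, seed=0):
    """Assumed ensemble scheme: base fuzzy clusterings on random
    projections, combined through a fuzzy similarity matrix."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    proj_dim = proj_dim or max(2, d // 2)
    S = np.zeros((n, n))
    for _ in range(n_base):
        # Gaussian random projection to a lower-dimensional subspace.
        R = rng.normal(size=(d, proj_dim)) / np.sqrt(proj_dim)
        U = fuzzy_c_means(X @ R, c, rng=rng)
        # Product t-norm summed over clusters: S_ij = sum_k u_ik * u_jk.
        S += U @ U.T
    S /= n_base
    # Consensus: cluster the samples again, described by their similarity rows.
    consensus = fuzzy_c_means(S, c, rng=rng)
    return consensus, S
```

Each row of the returned consensus matrix gives graded memberships, so a gene or sample can belong to several clusters at once; taking `argmax` over a row yields a crisp assignment when one is needed. Other combination rules (for example a minimum t-norm, or thresholding the memberships before combining) would give different members of the same family of algorithms.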
Similar resources
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
Ensemble clustering with a fuzzy approach
Ensemble clustering is a novel research field that extends to unsupervised learning the approach originally developed for classification and supervised learning problems. In particular ensemble clustering methods have been developed to improve the robustness and accuracy of clustering algorithms, as well as the ability to capture the structure of complex data. In many clustering applications an...
Comparison Between Unsupervised and Supervise Fuzzy Clustering Method in Interactive Mode to Obtain the Best Result for Extract Subtle Patterns from Seismic Facies Maps
Pattern recognition on seismic data is a useful technique for generating seismic facies maps that capture changes in the geological depositional setting. Seismic facies analysis can be performed using the supervised and unsupervised pattern recognition methods. Each of these methods has its own advantages and disadvantages. In this paper, we compared and evaluated the capability of two unsuperv...
Fuzzy Ensemble Clustering for DNA Microarray Data Analysis
Two major problems related to the unsupervised analysis of gene expression data are represented by the accuracy and reliability of the discovered clusters, and by the biological fact that classes of examples or classes of functionally related genes are sometimes not clearly defined. To address these issues, we propose a fuzzy ensemble clustering approach to both improve the accuracy of clustering resu...
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...